Skip to content
This repository has been archived by the owner on Nov 1, 2023. It is now read-only.

fix agent retry on connection level failures #623

Merged
8 commits merged into from
Mar 2, 2021

Conversation

bmc-msft
Copy link
Contributor

@bmc-msft bmc-msft commented Mar 2, 2021

In debugging the connection retry issues, I dug into this more.

Apparently, some of hyper's connection errors are not mapped to std::io::Error, rendering the existing downcast impl ineffective.

As such, this PR makes the following updates:

  1. Any request that fails for what reqwest calls a connection error is considered transient.
  2. Updates the retry notify code to use our warn macro such that the events show up in application insights.
  3. Updates the unit test to demonstrate that failures by trying to connect to http://localhost:81/, which shouldn't be listening on any system.
  4. Adds a simple unit test to verify with send_retry_default, connections to https://www.microsoft.com work

Fixes #263

src/agent/reqwest-retry/src/lib.rs Show resolved Hide resolved
src/agent/reqwest-retry/src/lib.rs Show resolved Hide resolved
src/agent/reqwest-retry/src/lib.rs Outdated Show resolved Hide resolved
src/agent/reqwest-retry/src/lib.rs Show resolved Hide resolved
@bmc-msft bmc-msft requested a review from chkeita March 2, 2021 01:31
@@ -107,7 +115,7 @@ pub async fn send_retry_reqwest<
max_elapsed_time: Some(max_elapsed_time),
..ExponentialBackoff::default()
},
|err, _| println!("Transient error: {}", err),
|err, _| warn!("transient http error: {}", err),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add the url in the message here it might be helpful when debugging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the URL here isn't easily accessible. At this context, we have a Builder, which we'd have to TryClone to get access into the URL.

I agree that it would be valuable. Perhaps a follow-up can refactor this to make it accessible?

@ghost
Copy link

ghost commented Mar 2, 2021

Hello @bmc-msft!

Because this pull request has the auto-merge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

Do note that I've been instructed to only help merge pull requests of this repository that have been opened for at least 18 hours, a condition that will be fulfilled in about 16 minutes. No worries though, I will be back when the time is right! 😉

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@ghost ghost merged commit b970937 into microsoft:main Mar 2, 2021
@bmc-msft bmc-msft deleted the fix-connection-issue-retry branch March 19, 2021 15:42
@ghost ghost locked as resolved and limited conversation to collaborators Apr 18, 2021
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Log in agent retries for debugging
4 participants